Required Packages

In the code chunk below, we use the pacman package to load the required packages and install them if necessary.

if(require(pacman)==FALSE) install.packages("pacman")
## Loading required package: pacman
pacman::p_load(tidyverse, 
               magrittr, # for the two-way pipe
               fpp2, # accuracy() function, Arima(), auto.arima(),
               astsa, # package is for the required data
               plotly, # for interactive graphs
               stargazer) 

Loading the J&J Data

jj = jj # assigning the jj data (from astsa) to an object of the same name
class(jj) # to check and see if the class is a ts object
## [1] "ts"
frequency(jj) # to check the frequency --> based on the print out (=4)
## [1] 4
jj
##           Qtr1      Qtr2      Qtr3      Qtr4
## 1960  0.710000  0.630000  0.850000  0.440000
## 1961  0.610000  0.690000  0.920000  0.550000
## 1962  0.720000  0.770000  0.920000  0.600000
## 1963  0.830000  0.800000  1.000000  0.770000
## 1964  0.920000  1.000000  1.240000  1.000000
## 1965  1.160000  1.300000  1.450000  1.250000
## 1966  1.260000  1.380000  1.860000  1.560000
## 1967  1.530000  1.590000  1.830000  1.860000
## 1968  1.530000  2.070000  2.340000  2.250000
## 1969  2.160000  2.430000  2.700000  2.250000
## 1970  2.790000  3.420000  3.690000  3.600000
## 1971  3.600000  4.320000  4.320000  4.050000
## 1972  4.860000  5.040000  5.040000  4.410000
## 1973  5.580000  5.850000  6.570000  5.310000
## 1974  6.030000  6.390000  6.930000  5.850000
## 1975  6.930000  7.740000  7.830000  6.120000
## 1976  7.740000  8.910000  8.280000  6.840000
## 1977  9.540000 10.260000  9.540000  8.729999
## 1978 11.880000 12.060000 12.150000  8.910000
## 1979 14.040000 12.960000 14.850000  9.990000
## 1980 16.200000 14.670000 16.020000 11.610000
p = autoplot(jj) + theme_bw()
ggplotly(p)

Based on the plot, we can make three observations:

  1. The data is not linear, which means that fitting a linear regression directly to this data is not prudent.
  2. The variance was not constant as it increased over time (with larger values of the EPS).
  3. The data is exhibting a seasonal pattern (fourth quarter is consistently below the values for q3 and q1).

Transforming the J&J TS

log_jj = log(jj)
p2 = autoplot(log_jj)
ggplotly(p2) # a linear regression line is probably okay (the variance is more stable with the transformation)

Time as the Independent Variable

t = time(log_jj) # time makes a decimal date from the ts (if freq  >  1)
t
##         Qtr1    Qtr2    Qtr3    Qtr4
## 1960 1960.00 1960.25 1960.50 1960.75
## 1961 1961.00 1961.25 1961.50 1961.75
## 1962 1962.00 1962.25 1962.50 1962.75
## 1963 1963.00 1963.25 1963.50 1963.75
## 1964 1964.00 1964.25 1964.50 1964.75
## 1965 1965.00 1965.25 1965.50 1965.75
## 1966 1966.00 1966.25 1966.50 1966.75
## 1967 1967.00 1967.25 1967.50 1967.75
## 1968 1968.00 1968.25 1968.50 1968.75
## 1969 1969.00 1969.25 1969.50 1969.75
## 1970 1970.00 1970.25 1970.50 1970.75
## 1971 1971.00 1971.25 1971.50 1971.75
## 1972 1972.00 1972.25 1972.50 1972.75
## 1973 1973.00 1973.25 1973.50 1973.75
## 1974 1974.00 1974.25 1974.50 1974.75
## 1975 1975.00 1975.25 1975.50 1975.75
## 1976 1976.00 1976.25 1976.50 1976.75
## 1977 1977.00 1977.25 1977.50 1977.75
## 1978 1978.00 1978.25 1978.50 1978.75
## 1979 1979.00 1979.25 1979.50 1979.75
## 1980 1980.00 1980.25 1980.50 1980.75

Fit the Model

model = lm (log_jj ~ t) # t is the ind variable and log_jj is the response
names(model)
##  [1] "coefficients"  "residuals"     "effects"       "rank"         
##  [5] "fitted.values" "assign"        "qr"            "df.residual"  
##  [9] "xlevels"       "call"          "terms"         "model"
summary(model)
## 
## Call:
## lm(formula = log_jj ~ t)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.38309 -0.08569  0.00297  0.09984  0.38016 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -3.275e+02  5.623e+00  -58.25   <2e-16 ***
## t            1.668e-01  2.854e-03   58.45   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1585 on 82 degrees of freedom
## Multiple R-squared:  0.9766, Adjusted R-squared:  0.9763 
## F-statistic:  3416 on 1 and 82 DF,  p-value: < 2.2e-16
stargazer::stargazer(model, type = "html")
Dependent variable:
log_jj
t 0.167***
(0.003)
Constant -327.548***
(5.623)
Observations 84
R2 0.977
Adjusted R2 0.976
Residual Std. Error 0.159 (df = 82)
F Statistic 3,416.473*** (df = 1; 82)
Note: p<0.1; p<0.05; p<0.01
LS0tDQp0aXRsZTogIjIzIC0gQ2xhc3MgTm90ZXMiDQphdXRob3I6ICJGYWRlbCBNIE1lZ2FoZWQiDQpkYXRlOiAiNC8xOS8yMDIxIg0Kb3V0cHV0OiANCiAgaHRtbF9kb2N1bWVudDoNCiAgICB0b2M6IHRydWUNCiAgICB0b2NfZmxvYXQ6IHRydWUNCiAgICBwYWdlZF9kZjogdHJ1ZQ0KICAgIHRoZW1lOiBzaW1wbGV4DQogICAgY29kZV9mb2xkaW5nOiBzaG93DQogICAgY29kZV9kb3dubG9hZDogdHJ1ZQ0KLS0tDQoNCmBgYHtyIHNldHVwLCBpbmNsdWRlPUZBTFNFfQ0Ka25pdHI6Om9wdHNfY2h1bmskc2V0KGVjaG8gPSBUUlVFKQ0KYGBgDQoNCiMgUmVxdWlyZWQgUGFja2FnZXMNCkluIHRoZSBjb2RlIGNodW5rIGJlbG93LCB3ZSB1c2UgdGhlIGBwYWNtYW5gIHBhY2thZ2UgdG8gbG9hZCB0aGUgcmVxdWlyZWQgcGFja2FnZXMgYW5kIGluc3RhbGwgdGhlbSBpZiBuZWNlc3NhcnkuDQoNCmBgYHtyIHBhY2thZ2VzfQ0KaWYocmVxdWlyZShwYWNtYW4pPT1GQUxTRSkgaW5zdGFsbC5wYWNrYWdlcygicGFjbWFuIikNCnBhY21hbjo6cF9sb2FkKHRpZHl2ZXJzZSwgDQogICAgICAgICAgICAgICBtYWdyaXR0ciwgIyBmb3IgdGhlIHR3by13YXkgcGlwZQ0KICAgICAgICAgICAgICAgZnBwMiwgIyBhY2N1cmFjeSgpIGZ1bmN0aW9uLCBBcmltYSgpLCBhdXRvLmFyaW1hKCksDQogICAgICAgICAgICAgICBhc3RzYSwgIyBwYWNrYWdlIGlzIGZvciB0aGUgcmVxdWlyZWQgZGF0YQ0KICAgICAgICAgICAgICAgcGxvdGx5LCAjIGZvciBpbnRlcmFjdGl2ZSBncmFwaHMNCiAgICAgICAgICAgICAgIHN0YXJnYXplcikgDQpgYGANCg0KIyBMb2FkaW5nIHRoZSBKJkogRGF0YQ0KYGBge3Igamp9DQpqaiA9IGpqICMgYXNzaWduaW5nIHRoZSBqaiBkYXRhIChmcm9tIGFzdHNhKSB0byBhbiBvYmplY3Qgb2YgdGhlIHNhbWUgbmFtZQ0KY2xhc3MoamopICMgdG8gY2hlY2sgYW5kIHNlZSBpZiB0aGUgY2xhc3MgaXMgYSB0cyBvYmplY3QNCmZyZXF1ZW5jeShqaikgIyB0byBjaGVjayB0aGUgZnJlcXVlbmN5IC0tPiBiYXNlZCBvbiB0aGUgcHJpbnQgb3V0ICg9NCkNCmpqDQpwID0gYXV0b3Bsb3QoamopICsgdGhlbWVfYncoKQ0KZ2dwbG90bHkocCkNCmBgYA0KDQpCYXNlZCBvbiB0aGUgcGxvdCwgd2UgY2FuIG1ha2UgdGhyZWUgb2JzZXJ2YXRpb25zOiAgDQoNCiAgKDEpIFRoZSBkYXRhIGlzIG5vdCBsaW5lYXIsIHdoaWNoIG1lYW5zIHRoYXQgZml0dGluZyBhIGxpbmVhciByZWdyZXNzaW9uIGRpcmVjdGx5IHRvIHRoaXMgZGF0YSBpcyBub3QgcHJ1ZGVudC4gIA0KICAoMikgVGhlIHZhcmlhbmNlIHdhcyBub3QgY29uc3RhbnQgYXMgaXQgaW5jcmVhc2VkIG92ZXIgdGltZSAod2l0aCBsYXJnZXIgdmFsdWVzIG9mIHRoZSBFUFMpLiAgDQogICgzKSBUaGUgZGF0YSBpcyBleGhpYnRpbmcgYSBzZWFzb25hbCBwYXR0ZXJuIChmb3VydGggcXVhcnRlciBpcyBjb25zaXN0ZW50bHkgYmVsb3cgdGhlIHZhbHVlcyBmb3IgcTMgYW5kIHExKS4NCiAgDQoNCiMgVHJhbnNmb3JtaW5nIHRoZSBKJkogVFMNCg0KYGBge3IgbG9ndHJhbnNmb3JtfQ0KbG9nX2pqID0gbG9nKGpqKQ0KcDIgPSBhdXRvcGxvdChsb2dfamopDQpnZ3Bsb3RseShwMikgIyBhIGxpbmVhciByZWdyZXNzaW9uIGxpbmUgaXMgcHJvYmFibHkgb2theSAodGhlIHZhcmlhbmNlIGlzIG1vcmUgc3RhYmxlIHdpdGggdGhlIHRyYW5zZm9ybWF0aW9uKQ0KYGBgDQoNCg0KIyBUaW1lIGFzIHRoZSBJbmRlcGVuZGVudCBWYXJpYWJsZQ0KYGBge3IgZXh0cmFjdGluZ1RpbWV9DQp0ID0gdGltZShsb2dfamopICMgdGltZSBtYWtlcyBhIGRlY2ltYWwgZGF0ZSBmcm9tIHRoZSB0cyAoaWYgZnJlcSAgPiAgMSkNCnQNCmBgYA0KDQojIEZpdCB0aGUgTW9kZWwNCg0KYGBge3IgbW9kZWx9DQptb2RlbCA9IGxtIChsb2dfamogfiB0KSAjIHQgaXMgdGhlIGluZCB2YXJpYWJsZSBhbmQgbG9nX2pqIGlzIHRoZSByZXNwb25zZQ0KbmFtZXMobW9kZWwpDQpzdW1tYXJ5KG1vZGVsKQ0KYGBgDQoNCmBgYHtyIG1vZGVsUHJpbnQsIHJlc3VsdHM9J2FzaXMnfQ0Kc3RhcmdhemVyOjpzdGFyZ2F6ZXIobW9kZWwsIHR5cGUgPSAiaHRtbCIpDQpgYGA=